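1.Function table

Function	Purpose
match	Matches at the start of the string; returns None if there is no match.
fullmatch	Matches only if the pattern covers the entire string.
search	Scans the string and returns the first match found at any position.
sub	Replaces the parts of the string that match the pattern.
subn	Same as sub, but returns a tuple of the new string and the number of replacements.
split	Splits the string at each match of the pattern and returns the pieces as a list.
findall	Finds all non-overlapping matches in the string and returns them as a list.
finditer	Same as findall, but returns an iterator of match objects.
compile	Compiles a regular expression (pattern) into a regular expression object whose methods (match(), search(), sub(), split(), findall(), and others) can be reused.
purge	Clears the regular expression cache.
escape	Escapes the special characters in a pattern (before Python 3.7, every character except ASCII letters, digits, and `_`).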
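
The later sections do not come back to fullmatch, subn, and escape, so here is a minimal sketch of them, assuming Python 3. The variable names are illustrative only.

```python
import re

# fullmatch succeeds only when the pattern covers the entire string.
whole = re.fullmatch(r'\d+', '12345')        # a match object
partial = re.fullmatch(r'\d+', '12345abc')   # None

# subn returns a (new_string, number_of_replacements) tuple.
result = re.subn(r'-', '', '188-7438-6478')  # ('18874386478', 2)

# escape makes a literal string safe to embed in a pattern,
# so the "." below only matches a real dot.
literal = re.escape('a.b')
span = re.search(literal, 'axb a.b').span()  # (4, 7)

print(whole.group(), partial, result, span)
```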

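2.Optional flags

Flag	Description
re.I	Makes matching case-insensitive.
re.L	Makes matching locale-aware (in Python 3, only allowed with bytes patterns).
re.M	Multi-line matching; changes the behavior of ^ and $ so that they also match at line boundaries.
re.S	Makes `.` match every character, including newline.
re.U	Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B (and is the default for str patterns in Python 3).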
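
A short sketch of how the most common flags change the result, assuming Python 3. The sample text and variable names are made up for illustration.

```python
import re

text = "Hello\nworld"

# re.I: case-insensitive matching.
ignore_case = re.findall(r'hello', text, re.I)          # ['Hello']

# re.M: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r'^\w+', text, re.M)           # ['Hello', 'world']

# re.S: "." also matches the newline character.
without_s = re.search(r'Hello.world', text)             # None
with_s = re.search(r'Hello.world', text, re.S)          # matches the whole text

# Flags are combined with the bitwise OR operator.
combined = re.findall(r'^[a-z]+$', text, re.I | re.M)   # ['Hello', 'world']

print(ignore_case, line_starts, without_s, with_s.group() == text, combined)
```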

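3.re.match()

Matches at the beginning of the string.
Function prototype: re.match(pattern, string, flags=0)
On success: returns a match object.
On failure: returns None.

Parameter	Meaning
pattern	The regular expression to match.
string	The string to match against.
flags	Optional flags that control how the regular expression is matched.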

import re

# A match at the start of the string returns a match object
print(re.match('blog', 'blog.ojbkfeng.cn'))
# <re.Match object; span=(0, 4), match='blog'> (shown as _sre.SRE_Match before Python 3.7)

# No match at the start of the string returns None
print(re.match('cn', 'blog.ojbkfeng.cn'))
# None

print(re.match('blog', 'blog.ojbkfeng.cn').span())
# (0, 4)

strings = "what the fuck ???"
reg = re.match(r'(.*) the (.*?) .*', strings, re.M|re.I)
if reg:
    print(reg.group())   # the whole match
    # what the fuck ???
    print(reg.group(1))  # first captured group
    # what
    print(reg.group(2))  # second captured group
    # fuck
else:
    print("No match!!")

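4.re.search()

Scans the whole string and returns the first successful match.
Function prototype: re.search(pattern, string, flags=0)
On success: returns a match object.
On failure: returns None.
The parameters and usage are the same as for match().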
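
To make the difference from match() concrete, here is a minimal sketch that reuses the URL from the match() examples. The variable names are illustrative.

```python
import re

url = 'blog.ojbkfeng.cn'

# match() only looks at the start of the string, so 'cn' is not found.
at_start = re.match('cn', url)        # None

# search() scans forward and finds 'cn' at the end.
anywhere = re.search('cn', url)       # span (14, 16)

# Only the first match is returned, even if more exist.
first_o = re.search(r'o\w', url)      # 'og', not 'oj'

print(at_start, anywhere.span(), first_o.group())
```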

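5.re.sub()

Replaces the parts of the string that match the pattern.
Function prototype: re.sub(pattern, repl, string, count=0, flags=0)

Parameter	Meaning
pattern	The regular expression pattern.
repl	The replacement string; may also be a function.
string	The original string to search and replace in.
count	Maximum number of replacements to make; the default 0 replaces every match.
flags	Optional flags, as for match().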

strings = "188-7438-6478"
num = re.sub(r'\D', "", strings)
print(num) # 18874386478
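
The parameter table mentions that repl may be a function and that count limits the number of replacements. A minimal sketch of both, assuming Python 3; the helper `double` and the sample strings are made up for illustration.

```python
import re

def double(match):
    # The callable receives a match object and returns the replacement string.
    return str(int(match.group()) * 2)

# repl as a function: every number is doubled.
doubled = re.sub(r'\d+', double, 'a1 b22 c333')           # 'a2 b44 c666'

# count=1 replaces only the first match.
first_only = re.sub(r'-', '', '188-7438-6478', count=1)   # '1887438-6478'

# repl as a string may refer to captured groups with \1, \2, ...
swapped = re.sub(r'(\w+)@(\w+)', r'\2@\1', 'user@host')   # 'host@user'

print(doubled, first_only, swapped)
```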

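6.re.compile()

Compiles a regular expression (pattern) into a regular expression object. The object can be passed to the module-level functions, and it also provides its own match(), search(), sub(), split(), findall(), and finditer() methods.
Function prototype: re.compile(pattern, flags=0)
pattern: a regular expression given as a string
flags: optional flags; see the flag table in section 2.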

strings = "188-7438-6478"
reObj = re.compile(r'\D')
num = re.sub(reObj, "", strings)
print(num) # 18874386478
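
A compiled pattern can also be used through its own methods instead of being passed back to re.sub(). The following is a minimal sketch, assuming Python 3, with illustrative variable names.

```python
import re

phone = '188-7438-6478'
non_digit = re.compile(r'\D')

# The compiled object exposes the same operations as the module functions.
digits_only = non_digit.sub('', phone)     # '18874386478'
parts = non_digit.split(phone)             # ['188', '7438', '6478']
found = non_digit.findall(phone)           # ['-', '-']
first = non_digit.search(phone).start()    # 3

print(digits_only, parts, found, first)
```

Compiling once is mainly useful when the same pattern is applied many times, for example inside a loop.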

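7.re.split()

Splits the string at every substring that matches the pattern and returns the pieces as a list.
Function prototype: re.split(pattern, string, maxsplit=0, flags=0)
maxsplit: maximum number of splits; maxsplit=1 splits once, and the default 0 means no limit.
The other parameters are the same as for match().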

re.split(r'\W+', 'ojbk, ojbk, ojbkfeng.') # ['ojbk', 'ojbk', 'ojbkfeng', '']
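
A short sketch of the maxsplit parameter, assuming Python 3. The second call additionally shows that a capturing group in the pattern keeps the separators in the result; this behavior is not covered in the text above.

```python
import re

text = 'ojbk, ojbk, ojbkfeng.'

# maxsplit=1 splits at the first separator only.
once = re.split(r'\W+', text, maxsplit=1)
# ['ojbk', 'ojbk, ojbkfeng.']

# With a capturing group, the matched separators are returned too.
with_seps = re.split(r'(\W+)', text)
# ['ojbk', ', ', 'ojbk', ', ', 'ojbkfeng', '.', '']

print(once)
print(with_seps)
```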

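8.re.findall()

Finds every substring of the string that matches the regular expression and returns them as a list; if nothing matches, it returns an empty list.

Unlike match() and search(), which return at most one match, findall() returns every non-overlapping match.
Function prototype: re.findall(pattern, string, flags=0). A compiled pattern object offers findall(string[, pos[, endpos]]), whose parameters are listed below.

Parameter	Meaning
string	The string to search.
pos	Optional; index in the string where the search starts, default 0.
endpos	Optional; index in the string where the search ends, default the length of the string.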
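
A minimal sketch of findall(), assuming Python 3, covering the empty-list case and the pos/endpos parameters of the compiled-pattern method. The sample text is made up for illustration.

```python
import re

text = 'a1 b22 c333'

# All non-overlapping matches, as a list of strings.
numbers = re.findall(r'\d+', text)             # ['1', '22', '333']

# With several groups in the pattern, each item is a tuple of groups.
pairs = re.findall(r'([a-z])(\d+)', text)      # [('a', '1'), ('b', '22'), ('c', '333')]

# No match gives an empty list rather than None.
nothing = re.findall(r'x', text)               # []

# pos and endpos exist only on the compiled pattern's method;
# here only text[3:8], i.e. 'b22 c', is searched.
windowed = re.compile(r'\d+').findall(text, 3, 8)   # ['22']

print(numbers, pairs, nothing, windowed)
```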

9.re.finditer()

Works like findall(), except that it returns an iterator that yields match objects instead of a list of strings.
Function prototype: re.finditer(pattern, string, flags=0)
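
Because finditer() yields match objects, positions are available in addition to the matched text. A minimal sketch, assuming Python 3 and an illustrative sample string:

```python
import re

text = 'a1 b22 c333'

# Each item is a match object, so group() and span() can both be used.
matches = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]
# [('1', (1, 2)), ('22', (4, 6)), ('333', (8, 11))]

for value, span in matches:
    print(value, span)
```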

Python re module

2019-01-26
Regular expressions
Python

